Discourse Complements Lexical Semantics for Non-factoid Answer Reranking
نویسندگان
چکیده
We propose a robust answer reranking model for non-factoid questions that integrates lexical semantics with discourse information, driven by two representations of discourse: a shallow representation centered around discourse markers, and a deep one based on Rhetorical Structure Theory. We evaluate the proposed model on two corpora from different genres and domains: one from Yahoo! Answers and one from the biology domain, and two types of non-factoid questions: manner and reason. We experimentally demonstrate that the discourse structure of nonfactoid answers provides information that is complementary to lexical semantic similarity between question and answer, improving performance up to 24% (relative) over a state-of-the-art model that exploits lexical semantic similarity alone. We further demonstrate excellent domain transfer of discourse information, suggesting these discourse features have general utility to non-factoid question answering.
منابع مشابه
If You Can't Beat Them Join Them: Handcrafted Features Complement Neural Nets for Non-Factoid Answer Reranking
We show that a neural approach to the task of non-factoid answer reranking can benefit from the inclusion of tried-and-tested handcrafted features. We present a novel neural network architecture based on a combination of recurrent neural networks that are used to encode questions and answers, and a multilayer perceptron. We show how this approach can be combined with additional features, in par...
متن کاملHigher-order Lexical Semantic Models for Non-factoid Answer Reranking
Lexical semantic models provide robust performance for question answering, but, in general, can only capitalize on direct evidence seen during training. For example, monolingual alignment models acquire term alignment probabilities from semistructured data such as question-answer pairs; neural network language models learn term embeddings from unstructured text. All this knowledge is then used ...
متن کاملThe Pronto QA System at TREC 2007: Harvesting Hyponyms, Using Nominalisation Patterns, and Computing Answer Cardinality
The backbone of the Pronto QA system is linguistically-principled: Combinatory Categorial Grammar is used to generate syntactic analyses of questions and potential answer snippets, and Discourse Representation Theory is employed as semantic formalism to match the meanings of questions and answers. The key idea of the Pronto system is to use semantics to prune answer candidates, thereby exploiti...
متن کاملSpinning Straw into Gold: Using Free Text to Train Monolingual Alignment Models for Non-factoid Question Answering
Monolingual alignment models have been shown to boost the performance of question answering systems by ”bridging the lexical chasm” between questions and answers. The main limitation of these approaches is that they require semistructured training data in the form of question-answer pairs, which is difficult to obtain in specialized domains or lowresource languages. We propose two inexpensive m...
متن کاملThis is how we do it: Answer Reranking for Open-domain How Questions with Paragraph Vectors and Minimal Feature Engineering
We present a simple yet powerful approach to non-factoid answer reranking whereby question-answer pairs are represented by concatenated distributed representation vectors and a multilayer perceptron is used to compute the score for an answer. Despite its simplicity, our approach achieves state-of-the-art performance on a public dataset of How questions, outperforming systems which employ sophis...
متن کامل